Mining Empirical Data to Improve On-Line Mathematical Character Recognition

Authors

  • Elena Smirnova
  • Stephen M. Watt
Abstract

This chapter describes methods for increasing the accuracy of mathematical handwriting analysis by using context information. Our approach rests on the assumption that likely expression continuations can be derived from a database of mathematical expressions and then used to rank the candidates produced by isolated symbol recognition. We present how predicted continuations for an expression are derived, how they are combined with the recognition candidates, and how effective the results are. We first review the techniques we have used to build and represent a mathematical context database. We then describe different strategies for combining context information with the results obtained from recognizing individual characters. Finally, we summarize a case study that uses a fixed dataset of common mathematical expressions to test the accuracy of on-line analysis.
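The idea of ranking recognition candidates with predicted continuations can be illustrated with a small sketch. The abstract does not say how the context database is represented or how the scores are combined, so everything below is an assumption made for illustration: a bigram count table stands in for the context database, add-one smoothing stands in for the prediction step, and a linear interpolation with a hypothetical `weight` parameter stands in for the combination strategy. The function names and the toy data are likewise hypothetical.

```python
from collections import Counter, defaultdict


def build_bigram_counts(expressions):
    """Count how often each symbol follows each preceding symbol.

    Assumption: the context database is a list of symbol sequences.
    """
    counts = defaultdict(Counter)
    for symbols in expressions:
        for prev, nxt in zip(symbols, symbols[1:]):
            counts[prev][nxt] += 1
    return counts


def continuation_probability(counts, prev, candidate, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(candidate | prev)."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return (following[candidate] + alpha) / (total + alpha * vocab_size)


def rerank(candidates, counts, prev, vocab_size, weight=0.5):
    """Re-rank (symbol, recognizer_score) pairs.

    Assumption: scores are combined by linear interpolation, where
    `weight` is the share given to the context prediction.
    """
    scored = []
    for symbol, rec_score in candidates:
        ctx = continuation_probability(counts, prev, symbol, vocab_size)
        scored.append((symbol, (1 - weight) * rec_score + weight * ctx))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy context database (hypothetical data, not from the chapter).
database = [
    ["x", "^", "2", "+", "1"],
    ["x", "^", "2", "-", "y"],
    ["a", "^", "2", "+", "b"],
]
counts = build_bigram_counts(database)
vocab = {s for expr in database for s in expr} | {"z"}

# The isolated recognizer slightly prefers "z" over "2" after a "^".
candidates = [("z", 0.50), ("2", 0.45)]
ranked = rerank(candidates, counts, prev="^", vocab_size=len(vocab))
print([symbol for symbol, _ in ranked])
```

In this toy setting the recognizer alone would pick "z", but because "2" follows "^" in every database expression, the combined score moves "2" to the top. Other combination rules, such as a weighted product of the two scores, would fit the same interface.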

Similar resources

Integrating AHP and data mining for effective retailer segmentation based on retailer lifetime value

Data mining techniques have been used widely in the area of customer relationship management (CRM). In this study, we have applied data mining techniques to address a problem in a business-to-business (B2B) setting. In a manufacturer-retailer-consumer chain, a manufacturer should improve its relationship with retailers to continue its business. Segmentation is a useful tool for identifying groups...

Combining Prediction and Recognition to Improve On-Line Mathematical Character Recognition

This paper describes methods to increase the accuracy of mathematical handwriting analysis by using context information. Our approach is based on the assumption that likely expression continuations can be derived from a database of mathematical expressions and then can be used to rank the candidates of isolated symbol recognition. We present how predicted continuations for an expression are de...

Operative assessment of predicted generalization errors on non-stationary distributions in data-intensive applications

Data-intensive applications use empirical methods to extract consistent information from huge samples. When applied to classification tasks, their aim is to optimize accuracy on unseen data; hence, a reliable prediction of the generalization error is of paramount importance. Theoretical models, such as Statistical Learning Theory, and empirical estimations, such as cross-validation, can both fit ...

Line-touching character recognition based on dynamic reference feature synthesis

In recognizing characters written on forms, it often happens that characters overlap with pre-printed form lines. In order to recognize overlapped characters, removal of the line and restoration of the broken character strokes caused by line removal are generally conducted. But it is not easy to restore the broken character strokes accurately especially when the direction of the line and the ch...

A Study to Improve the Response in Email Campaigning by Comparing Data Mining Segmentation Approaches in Aditi Technologies

Email marketing is increasingly recognized as an effective Internet marketing tool. In this study, a questionnaire is constructed and distributed to a sample of 146 prospects of Aditi Technologies to find the factors associated with higher response rates. The collected data is analyzed using Factor Analysis and the 11 factors, From Line, Subject Line, Personalization of the subject line, Timing...

Publication date: 2008